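In this paper, we establish efficient and uncoupled learning dynamics so that, when employed by all players in multiplayer perfect-recall imperfect-information extensive-form games, the \emph{trigger regret} of each player grows as $O(\log T)$ after $T$ repetitions of play. This improves exponentially over the prior best known trigger-regret bound of $O(T^{1/4})$, and settles a recent open question by Bai et al. (2022). As an immediate consequence, we guarantee convergence to the set of \emph{coarse correlated equilibria} at a near-optimal rate of $\frac{\log T}{T}$. Building on prior work, at the heart of our construction lies a more general result regarding fixed points deriving from rational functions with \emph{polynomial degree}, a property that we establish for the fixed points of \emph{(coarse) trigger deviation functions}. Moreover, our construction leverages a refined \textit{regret circuit} for the convex hull, which, unlike prior guarantees, preserves the \emph{RVU property} introduced by Syrgkanis et al. (NIPS, 2015); this observation is of independent interest for establishing near-optimal regret under learning dynamics based on a CFR-type decomposition of the regret.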
translated by 谷歌翻译
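A recent line of work has established uncoupled learning dynamics such that, when employed by all players in a game, each player's \emph{regret} after $T$ repetitions grows polylogarithmically in $T$, an exponential improvement over the traditional guarantees within the no-regret framework. However, so far these results have been limited to certain classes of games with structured strategy spaces, such as normal-form and extensive-form games. Whether $O(\text{polylog}\, T)$ regret bounds can be obtained for general convex and compact strategy sets (which occur in many fundamental models in economics and multiagent systems) while retaining efficient strategy updates is an important question. In this paper, we answer it in the positive by establishing the first uncoupled learning algorithm with $O(\log T)$ per-player regret for general convex and compact strategy sets. Our learning dynamics are based on an instantiation of optimistic follow-the-regularized-leader over an appropriately \emph{lifted} space, using a \emph{self-concordant regularizer} that is, peculiarly, not a barrier for the feasible region. Further, our learning dynamics are efficiently implementable given access to a proximal oracle for the convex strategy set, leading to $O(\log \log T)$ per-iteration complexity; we also give extensions when access to only a \emph{linear} optimization oracle is assumed. Finally, we adapt our dynamics to guarantee $O(\sqrt{T})$ regret in the adversarial regime. Even in those special cases where prior results apply, our algorithm improves over the state-of-the-art regret bounds, either in the dependence on the number of iterations or in the dependence on the dimension of the strategy sets.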
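Recent techniques for approximating Nash equilibria in very large games leverage neural networks to learn approximately optimal policies (strategies). One promising line of research uses neural networks to approximate counterfactual regret minimization (CFR) or its modern variants. DREAM, currently the only CFR-based neural method that is model-free and therefore scalable to very large games, trains a neural network on an estimated regret target that can have extremely high variance due to an importance sampling term inherited from Monte Carlo CFR (MCCFR). In this paper, we propose an unbiased model-free method that does not require any importance sampling. Our method, ESCHER, is principled and is guaranteed to converge to an approximate Nash equilibrium with high probability in the tabular case. We show that the variance of the estimated regret of a tabular version of ESCHER with an oracle value function is significantly lower than that of outcome-sampling MCCFR and of tabular DREAM with an oracle value function. We then show that a deep learning version of ESCHER outperforms the prior state of the art, DREAM and neural fictitious self-play (NFSP), and that the difference becomes dramatic as game size increases.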
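We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior. Imitation learning is effective at predicting human actions but may not match the strength of expert humans, while self-play learning and search techniques (e.g., AlphaZero) lead to strong performance but may produce policies that are difficult for humans to understand and coordinate with. We show in chess that applying Monte Carlo tree search with search policies regularized by the KL divergence from an imitation-learned policy produces policies that have higher human prediction accuracy and are stronger than the imitation policy. We then introduce a novel regret minimization algorithm that is regularized based on the KL divergence from an imitation-learned policy, and show that applying this algorithm to no-press Diplomacy yields a policy that maintains the same human prediction accuracy as imitation learning while being substantially stronger.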
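Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS`21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is $O(\textrm{polylog}(T))$ after $T$ repetitions of the game. We extend their result from external regret to internal regret and swap regret, thereby establishing uncoupled learning dynamics that converge to an approximate correlated equilibrium at the rate of $\tilde{O}(T^{-1})$. This substantially improves over the prior best rate of convergence for correlated equilibria due to Chen and Peng (NeurIPS`20), and it is optimal within the no-regret framework up to polylogarithmic factors in $T$. To obtain these results, we develop new techniques for establishing higher-order smoothness for learning dynamics involving fixed point operations. Specifically, we establish that the no-internal-regret learning dynamics of Stoltz and Lugosi (Mach Learn`05) are equivalently simulated by no-external-regret dynamics on a combinatorial space. This allows us to trade the computation of the stationary distribution on a polynomial-sized Markov chain for a (much more well-behaved) linear transformation on an exponential-sized set, enabling us to leverage techniques similar to those of DFG to near-optimally bound the internal regret. Moreover, we establish an $O(\textrm{polylog}(T))$ no-swap-regret bound for the classic algorithm of Blum and Mansour (BM) (JMLR`07). We do so by introducing a technique based on the Cauchy Integral Formula that circumvents the more limited combinatorial arguments of DFG. In addition to shedding clarity on the near-optimal regret guarantees of BM, our arguments provide insights into the various ways in which the techniques of DFG can be extended and leveraged in the analysis of more involved learning algorithms.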
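To make the OMWU update named in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common form of the update, in which each action's weight is multiplied by $\exp(\eta(2u_t - u_{t-1}))$ for utility vectors $u_t$, and it uses a two-player zero-sum matrix game purely for brevity (the abstract concerns general-sum games). The names `omwu_step` and `self_play`, the payoff matrix, and the step size are illustrative assumptions.

```python
import math

def omwu_step(x, u, u_prev, eta):
    """One Optimistic MWU step on the simplex (utilities are maximized).

    The previous utility vector u_prev serves as the prediction, giving the
    multiplicative factor exp(eta * (2*u - u_prev)) for each action.
    """
    weights = [xi * math.exp(eta * (2.0 * ui - upi))
               for xi, ui, upi in zip(x, u, u_prev)]
    total = sum(weights)
    return [w / total for w in weights]

def self_play(A, rounds=200, eta=0.1):
    """Both players run OMWU; A[i][j] is the row player's payoff (assumed zero-sum)."""
    n, m = len(A), len(A[0])
    x, y = [1.0 / n] * n, [1.0 / m] * m
    ux_prev, uy_prev = [0.0] * n, [0.0] * m
    for _ in range(rounds):
        # Expected utility of each pure action against the opponent's current mix.
        ux = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        uy = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
        x, y = omwu_step(x, ux, ux_prev, eta), omwu_step(y, uy, uy_prev, eta)
        ux_prev, uy_prev = ux, uy
    return x, y

# Hypothetical 2x2 payoff matrix, used only to exercise the update.
x_final, y_final = self_play([[2.0, -1.0], [-1.0, 1.0]])
```

Each player only observes its own utility vector, which is what makes the dynamics uncoupled in the sense used above.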
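The existence of simple, uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form (that is, tree-form) games generalize normal-form games by modeling both sequential and simultaneous moves, as well as private information. Because of the sequential nature and the presence of partial information in the game, extensive-form correlation has significantly different properties than its normal-form counterpart, many of which are still open research directions. Extensive-form correlated equilibrium (EFCE) has been proposed as the natural extensive-form counterpart to normal-form correlated equilibrium. However, it has so far been unknown whether EFCE emerges as the result of uncoupled agent dynamics. In this paper, we give the first uncoupled no-regret dynamics that converge to the set of EFCEs in $n$-player general-sum extensive-form games with perfect recall. First, we introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games. When each player has low trigger regret, the empirical frequency of play is close to an EFCE. Then, we give an efficient no-trigger-regret algorithm. Our algorithm decomposes trigger regret into local subproblems at each decision point of the player, and constructs the player's global strategy from the local solutions at each decision point.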
Diffusion models have achieved justifiable popularity by attaining state-of-the-art performance in generating realistic objects from seemingly arbitrarily complex data distributions, including when conditioning generation on labels. However, their iterative nature renders them computationally inefficient during sampling. For the multi-class conditional generation problem, we propose a novel, structurally unique framework of diffusion models which are hierarchically branched according to the inherent relationships between classes. In this work, we demonstrate that branched diffusion models offer major improvements in efficiently generating samples from multiple classes. We also showcase several other advantages of branched diffusion models, including ease of extension to novel classes in a continual-learning setting, and a unique interpretability that offers insight into these generative models. Branched diffusion models represent an alternative paradigm to their traditional linear counterparts, and can have a large impact on how we use diffusion models for efficient generation, online learning, and scientific discovery.
Polynomial kernels are widely used in machine learning, where they are one of the default choices for developing kernel-based classification and regression models. However, they are rarely used or considered in numerical analysis due to their lack of strict positive definiteness. In particular, they do not enjoy the usual property of unisolvency for arbitrary point sets, which is one of the key properties used to build kernel-based interpolation methods. This paper is devoted to establishing some initial results for the study of these kernels, and their related interpolation algorithms, in the context of approximation theory. We will first prove necessary and sufficient conditions on point sets which guarantee the existence and uniqueness of an interpolant. We will then study the Reproducing Kernel Hilbert Spaces (or native spaces) of these kernels and their norms, and provide inclusion relations between spaces corresponding to different kernel parameters. With these spaces at hand, it will be further possible to derive generic error estimates which apply to sufficiently smooth functions, thus escaping the native space. Finally, we will show how to apply an efficient and stable algorithm to these kernels to obtain accurate interpolants, and we will test them in some numerical experiments. After this analysis several computational and theoretical aspects remain open, and we will outline possible further research directions in a concluding section. This work builds some bridges between kernel and polynomial interpolation, two topics to which the authors, to different extents, have been introduced under the supervision or through the work of Stefano De Marchi. For this reason, they wish to dedicate this work to him on the occasion of his 60th birthday.
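As a small illustration of kernel-based interpolation with a polynomial kernel, the sketch below solves the linear system $Kc = y$ with $K_{ij} = (a + x_i x_j)^p$ and evaluates $s(x) = \sum_j c_j (a + x x_j)^p$. It is only a one-dimensional toy under stated assumptions: the kernel form with parameters `a` and `p`, the helper names `poly_kernel`, `solve`, and `fit`, and the plain Gaussian elimination are illustrative choices, not the stable algorithm the abstract refers to. The solver raises an error when the kernel matrix is numerically singular, which is the situation in which a point set fails to be unisolvent.

```python
def poly_kernel(x, y, a=1.0, p=2):
    """Polynomial kernel (a + x*y)**p for scalar inputs (assumed form)."""
    return (a + x * y) ** p

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("kernel matrix is numerically singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def fit(nodes, values, a=1.0, p=2):
    """Return the kernel interpolant s(x) = sum_j c_j * k(x, x_j)."""
    K = [[poly_kernel(xi, xj, a, p) for xj in nodes] for xi in nodes]
    coeffs = solve(K, values)
    return lambda x: sum(c * poly_kernel(x, xj, a, p)
                         for c, xj in zip(coeffs, nodes))

# Interpolate f(x) = x**2 at three nodes with a degree-2 kernel.
s = fit([-1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
```

In this example the kernel matrix is `[[4, 1, 0], [1, 1, 1], [0, 1, 4]]`, which is invertible, and the resulting interpolant reproduces $x^2$ exactly because it is itself a polynomial of degree at most 2.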
This paper presents the development of a system that estimates the 2D relative positions of nodes in a wireless network from distance measurements between the nodes. The system uses ultra-wideband ranging technology and the Bluetooth Low Energy protocol to acquire data. A nonlinear least squares problem is then formulated and solved numerically to estimate the relative positions of the nodes. The localization performance of the system is validated by experimental tests, which demonstrate that it can measure the relative positions of a 4-node network with an accuracy on the order of 3 cm at an update rate of 10 Hz. This shows the feasibility of applying the proposed system to multi-robot cooperative localization and formation control scenarios.
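The nonlinear least squares formulation mentioned above can be sketched as minimizing $\sum_{i<j} (\lVert p_i - p_j \rVert - d_{ij})^2$ over the node positions $p_i$. The example below is a hedged toy, not the paper's solver: it uses plain gradient descent, noise-free synthetic distances for four nodes on a unit square, and an initial guess near the true layout. The function name `estimate_positions`, the step size, and the iteration count are assumptions made for illustration. Because only distances are used, the result is determined only up to a rigid motion (translation, rotation, reflection).

```python
import math

def estimate_positions(dist, init, lr=0.05, iters=1000):
    """Gradient descent on sum_{i<j} (||p_i - p_j|| - dist[i][j])**2."""
    n = len(init)
    pos = [list(p) for p in init]
    for _ in range(iters):
        grad = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = math.hypot(dx, dy)
                if d < 1e-12:
                    continue  # skip coincident nodes to avoid dividing by zero
                g = 2.0 * (d - dist[i][j]) / d
                grad[i][0] += g * dx
                grad[i][1] += g * dy
                grad[j][0] -= g * dx
                grad[j][1] -= g * dy
        for i in range(n):
            pos[i][0] -= lr * grad[i][0]
            pos[i][1] -= lr * grad[i][1]
    return pos

# Synthetic example: four nodes on a unit square (hypothetical data).
truth = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dist = [[math.hypot(a[0] - b[0], a[1] - b[1]) for b in truth] for a in truth]
init = [(0.05, -0.08), (0.90, 0.03), (1.07, 1.09), (-0.04, 0.94)]
est = estimate_positions(dist, init)
```

With real, noisy range measurements a Gauss-Newton or Levenberg-Marquardt solver is the more usual choice, and a fixed anchor or alignment step is needed to compare estimates across updates.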
Steerable convolutional neural networks (CNNs) provide a general framework for building neural networks equivariant to translations and other transformations belonging to an origin-preserving group $G$, such as reflections and rotations. They rely on standard convolutions with $G$-steerable kernels obtained by analytically solving the group-specific equivariance constraint imposed onto the kernel space. As the solution is tailored to a particular group $G$, the implementation of a kernel basis does not generalize to other symmetry transformations, which complicates the development of group equivariant models. We propose using implicit neural representation via multi-layer perceptrons (MLPs) to parameterize $G$-steerable kernels. The resulting framework offers a simple and flexible way to implement Steerable CNNs and generalizes to any group $G$ for which a $G$-equivariant MLP can be built. We apply our method to point cloud (ModelNet-40) and molecular data (QM9) and demonstrate a significant improvement in performance compared to standard Steerable CNNs.